{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "faa79f3e-38bf-4336-8761-f8cd1453e870",
   "metadata": {},
   "source": [
    "# Using LLMs hosted on NVIDIA API Catalog \n",
    "\n",
    "This guide teaches you how to use NeMo Guardrails with LLMs hosted on NVIDIA API Catalog. It uses the [ABC Bot configuration](../../../../examples/bots/abc) and with the `meta/llama-3.1-70b-instruct` model. Similarly, you can use `meta/llama-3.1-405b-instruct`, `meta/llama-3.1-8b-instruct` or any other [AI Foundation Model](https://build.nvidia.com/explore/discover).\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "2ab1bd2c-2142-4e65-ad69-b2208b9f6926",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2024-07-24T20:07:24.986860Z",
     "start_time": "2024-07-24T20:07:24.826720Z"
    }
   },
   "outputs": [],
   "source": [
    "# Init: remove any existing configuration\n",
    "!rm -r config\n",
    "\n",
    "# Get rid of the TOKENIZERS_PARALLELISM warning\n",
    "import warnings\n",
    "\n",
    "warnings.filterwarnings(\"ignore\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "bf619d8e-7b97-4f3d-bc81-4d845594330e",
   "metadata": {},
   "source": [
    "## Prerequisites\n",
    "\n",
    "Before you begin, ensure you have the following prerequisites in place:\n",
    "\n",
    "1. Install the [langchain-nvidia-ai-endpoints](https://github.com/langchain-ai/langchain-nvidia/tree/main/libs/ai-endpoints) package:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "0abf75be-95a2-45f0-a300-d10381f7dea5",
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "!pip install -U --quiet langchain-nvidia-ai-endpoints"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "573aa13e-e907-4ec2-aca1-6b56e2bea2ea",
   "metadata": {},
   "source": [
    "2. An NVIDIA NGC account to access AI Foundation Models. To create a free account go to [NVIDIA NGC website](https://ngc.nvidia.com/).\n",
    "\n",
    "3. An API key from NVIDIA API Catalog:\n",
    "   - Generate an API key by navigating to the [AI Foundation Models](https://build.nvidia.com/explore/discover) section on the NVIDIA NGC website, selecting a model with an API endpoint, and generating an API key. You can use this API key for all models available in the NVIDIA API Catalog.\n",
    "   - Export the NVIDIA API key as an environment variable:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "dda7cdffdcaf47b6",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2024-07-24T20:07:27.353287Z",
     "start_time": "2024-07-24T20:07:27.235295Z"
    },
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "!export NVIDIA_API_KEY=$NVIDIA_API_KEY # Replace with your own key"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9a251dfe-6058-417f-9f9b-a71697e9e38f",
   "metadata": {},
   "source": [
    "4. If you're running this inside a notebook, patch the AsyncIO loop."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "bb13954b-7eb0-4f0c-a98a-48ca86809bc6",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2024-07-24T20:07:27.360147Z",
     "start_time": "2024-07-24T20:07:27.355529Z"
    }
   },
   "outputs": [],
   "source": [
    "import nest_asyncio\n",
    "\n",
    "nest_asyncio.apply()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6bf3af12-b487-435c-938b-579bb786a7f0",
   "metadata": {},
   "source": [
    "## Configuration\n",
    "\n",
    "To get started, copy the ABC bot configuration into a subdirectory called `config`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "69429851b10742a2",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2024-07-24T20:07:27.494286Z",
     "start_time": "2024-07-24T20:07:27.361039Z"
    },
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "!cp -r ../../../../examples/bots/abc config"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b98abee4-e727-41b8-9eed-4c536d2d072e",
   "metadata": {
    "jp-MarkdownHeadingCollapsed": true
   },
   "source": [
    "Update the `models` section of the `config.yml` file to the desired model supported by NVIDIA API Catalog:\n",
    "\n",
    "```yaml\n",
    "...\n",
    "models:\n",
    "  - type: main\n",
    "    engine: nvidia_ai_endpoints\n",
    "    model: meta/llama-3.1-70b-instruct\n",
    "...\n",
    "```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "525b4828f87104dc",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2024-07-24T20:07:27.500146Z",
     "start_time": "2024-07-24T20:07:27.495580Z"
    },
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "# Hide from documentation page.\n",
    "with open(\"config/config.yml\") as f:\n",
    "    content = f.read()\n",
    "\n",
    "content = content.replace(\n",
    "    \"\"\"\n",
    "  - type: main\n",
    "    engine: openai\n",
    "    model: gpt-3.5-turbo-instruct\"\"\",\n",
    "    \"\"\"\n",
    "  - type: main\n",
    "    engine: nvidia_ai_endpoints\n",
    "    model: meta/llama-3.1-70b-instruct\"\"\",\n",
    ")\n",
    "\n",
    "with open(\"config/config.yml\", \"w\") as f:\n",
    "    f.write(content)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b14e9279-a535-429a-91d3-805c8e146daa",
   "metadata": {},
   "source": [
    "## Usage \n",
    "\n",
    "Load the guardrail configuration:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "b332cafe-76e0-448d-ba3b-d8aa21ed66b4",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2024-07-24T20:07:30.383863Z",
     "start_time": "2024-07-24T20:07:27.501109Z"
    }
   },
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "820b167bcde040b1978fbe6d29c2d819",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Fetching 8 files:   0%|          | 0/8 [00:00<?, ?it/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "from nemoguardrails import LLMRails, RailsConfig\n",
    "\n",
    "config = RailsConfig.from_path(\"./config\")\n",
    "rails = LLMRails(config)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d4d9f276-5374-4504-ac4b-1f0fc86421fe",
   "metadata": {},
   "source": [
    "Test that it works: "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "8caba345-3363-4bc5-9c47-3b5bb92cefe4",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2024-07-24T20:07:34.476598Z",
     "start_time": "2024-07-24T20:07:30.384594Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "According to our company policy, you are eligible for 20 days of vacation per year, accrued monthly.\n"
     ]
    }
   ],
   "source": [
    "response = rails.generate(messages=[{\"role\": \"user\", \"content\": \"How many vacation days do I have per year?\"}])\n",
    "print(response[\"content\"])"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "db40602e4bcfefa8",
   "metadata": {
    "collapsed": false
   },
   "source": [
    "You can see that the bot responds correctly. "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ccc159fb65dde756",
   "metadata": {
    "collapsed": false
   },
   "source": [
    "## Conclusion\n",
    "\n",
    "In this guide, you learned how to connect a NeMo Guardrails configuration to an NVIDIA API Catalog LLM model. This guide uses `meta/llama-3.1-70b-instruct`, however, you can connect any other model by following the same steps. "
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.12"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
